Chapter 7 Correlation
Introduction to Correlation
Introduction
In previous chapters, you learned how to summarize data into a single representative value (central tendency) and measure its variability (dispersion). Now, you will learn how to examine the relationship between two variables.
We often observe relationships in our daily lives. For example, as the temperature rises, ice-cream sales increase. As the supply of a vegetable increases in the market, its price tends to drop. Correlation analysis is a statistical tool for systematically examining such relationships. It helps answer questions like:
- Is there any relationship between two variables?
- If the value of one variable changes, does the value of the other also change?
- Do both variables move in the same direction or in opposite directions?
- How strong is the relationship?
What Does Correlation Measure?
Correlation studies and measures the direction and intensity of the relationship between two or more variables.
Correlation Measures Covariation, Not Causation
It is crucial to understand that correlation measures covariation (how two variables move together), not causation (that one variable causes the other to change). The presence of correlation simply means that when one variable changes, the other variable also changes in a definite way.
Some relationships may have a cause-and-effect interpretation (e.g., low rainfall is related to low agricultural productivity). However, other relationships may be a mere coincidence. For example, a relationship between the arrival of migratory birds and local birth rates cannot be given a cause-and-effect interpretation. Sometimes, a third variable might be influencing both variables, creating a spurious correlation (e.g., rising temperature causes both an increase in ice-cream sales and an increase in drowning deaths, but ice-cream sales do not cause drowning).
For simplicity, we will assume that the correlation, if it exists, is linear, meaning the relationship can be represented by a straight line on a graph.
Types of Correlation
Correlation is commonly classified into positive and negative correlation, based on the direction of the relationship between the variables.
1. Positive Correlation
The correlation is said to be positive when the two variables move together in the same direction. If one variable increases, the other variable also increases, and if one decreases, the other also decreases.
Examples:
- As income rises, consumption also rises.
- As temperature rises, the sale of ice cream also rises.
2. Negative Correlation
The correlation is said to be negative when the two variables move in opposite directions. If one variable increases, the other variable decreases, and vice-versa.
Examples:
- As the price of a commodity falls, its demand increases.
- When you spend more time studying, the chances of failing decline.
Techniques for Measuring Correlation
Three important tools are used to study and measure correlation: Scatter Diagrams, Karl Pearson’s Coefficient of Correlation, and Spearman’s Rank Correlation.
1. Scatter Diagram
A scatter diagram is a graphical technique that presents the nature of the association between two variables visually, without giving a numerical value. The paired values of the two variables are plotted as points on graph paper, one variable on each axis.
The overall pattern of the plotted points gives a good idea of the nature and intensity of the relationship:
- If the points are scattered around an upward rising line, it indicates a positive correlation.
- If the points are scattered around a downward sloping line, it indicates a negative correlation.
- If there is no discernible pattern, it indicates no correlation.
- If all the points lie exactly on a line, the correlation is perfect (perfect positive or perfect negative).
2. Karl Pearson’s Coefficient of Correlation (r)
Also known as the product-moment correlation coefficient, this method gives a precise numerical value of the degree of linear relationship between two variables, X and Y. It should only be used when the scatter diagram suggests a linear relationship.
Properties of the Correlation Coefficient (r)
- No Unit: 'r' is a pure number and has no unit of measurement.
- Indicates Direction: A negative value of 'r' indicates an inverse (negative) relationship. A positive value of 'r' indicates a direct (positive) relationship.
- Range: The value of 'r' lies between -1 and +1 (i.e., $-1 \leq r \leq +1$).
- Indicates Strength:
  - If r = +1, the correlation is perfect positive.
  - If r = -1, the correlation is perfect negative.
  - If r = 0, there is no linear correlation.
  - A value of 'r' close to +1 or -1 indicates a strong linear relationship.
  - A value of 'r' close to 0 indicates a weak linear relationship.
- Unaffected by Origin and Scale: The value of 'r' is unaffected by a change of origin and change of scale.
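The origin-and-scale property can be checked in a few lines. As a sketch, suppose new variables $U$ and $V$ are formed by shifting and rescaling $X$ and $Y$. The constants $A$, $B$, $h$ and $k$ are illustrative symbols that do not appear in the notes above, and $h$ and $k$ are assumed to be positive.

$U = \frac{X - A}{h}, \qquad V = \frac{Y - B}{k}, \qquad h > 0,\ k > 0$

$U - \bar{U} = \frac{X - \bar{X}}{h}, \qquad V - \bar{V} = \frac{Y - \bar{Y}}{k}$

$r_{UV} = \frac{\frac{1}{hk}\sum(X - \bar{X})(Y - \bar{Y})}{\sqrt{\frac{1}{h^2}\sum(X - \bar{X})^2 \cdot \frac{1}{k^2}\sum(Y - \bar{Y})^2}} = \frac{\sum(X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum(X - \bar{X})^2 \sum(Y - \bar{Y})^2}} = r_{XY}$

The shift constants $A$ and $B$ drop out when deviations from the mean are taken, and the factor $\frac{1}{hk}$ cancels between numerator and denominator, so the coefficient is unchanged.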
Formula for Karl Pearson's Coefficient
The correlation coefficient 'r' is calculated as the covariance of X and Y divided by the product of their standard deviations.
$r = \frac{\text{Cov}(X,Y)}{\sigma_x \sigma_y} = \frac{\sum(X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum(X - \bar{X})^2 \sum(Y - \bar{Y})^2}}$
Where $\bar{X}$ and $\bar{Y}$ are the means of X and Y, and $\sigma_x$ and $\sigma_y$ are their standard deviations.
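The deviation form of the formula translates directly into code. The following is a minimal sketch, not a reference implementation: the function name `pearson_r` and the small data set are assumptions made up for illustration.

```python
import math


def pearson_r(x, y):
    """Karl Pearson's r, computed from deviations about the means."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]          # X - X-bar
    dy = [yi - mean_y for yi in y]          # Y - Y-bar
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    if sum_dx2 == 0 or sum_dy2 == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)


# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```

For this data the sum of products of deviations is 6 and the sums of squared deviations are 10 and 6, so r = 6/√60 ≈ 0.77, a fairly strong positive linear relationship.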
3. Spearman’s Rank Correlation ($r_k$)
Developed by C.E. Spearman, this method measures the linear association between the ranks assigned to individual items according to their attributes, rather than their actual values.
When to Use Rank Correlation
- When variables are attributes that cannot be numerically measured but can be ranked (e.g., beauty, honesty, intelligence).
- When the relationship between variables is clearly non-linear but its direction is consistent.
- When the data contain extreme values (outliers): ranks depend only on the order of the observations, so rank correlation is far less sensitive to such values than Karl Pearson's r.
Formula for Spearman's Rank Correlation
The formula for Spearman's rank correlation is:
$r_k = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$
Where 'D' is the difference between the ranks of the two variables for each observation, and 'n' is the number of observations.
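If ranks are repeated (tied), each tied item is given the average of the ranks the tied items would otherwise occupy, and a correction of $\frac{m^3 - m}{12}$ is added to $\sum D^2$ for every group of $m$ tied items before the formula is applied.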
The interpretation of $r_k$ is the same as that of Karl Pearson's 'r'.
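The rank formula can be sketched in code as follows. This is a minimal illustration that assumes no repeated values (so no tie correction is needed). The helper names `ranks` and `spearman_rank` and the marks data are hypothetical, not taken from the notes.

```python
def ranks(values):
    """Rank values from 1 (largest) to n. Assumes no repeated values."""
    if len(set(values)) != len(values):
        raise ValueError("repeated values need the tie-corrected formula")
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]


def spearman_rank(x, y):
    """Spearman's r_k = 1 - 6 * sum(D^2) / (n * (n^2 - 1)), without ties."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical marks of five students in two subjects.
marks_a = [85, 60, 73, 40, 90]
marks_b = [93, 75, 65, 50, 80]
r_k = spearman_rank(marks_a, marks_b)
print(round(r_k, 2))  # 0.8

# Replacing 90 by an extreme value leaves every rank, and hence r_k, unchanged.
r_k_outlier = spearman_rank([85, 60, 73, 40, 900], marks_b)
print(round(r_k_outlier, 2))  # 0.8
```

Here the rank differences are 1, 1, -1, 0 and -1, so the sum of squared differences is 4 and r_k = 1 - 24/120 = 0.8. The second call illustrates why rank correlation is preferred when the data contain extreme values.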
Conclusion
This chapter has discussed various techniques for studying the relationship between two variables, with a focus on linear relationships. The scatter diagram offers a visual presentation of the relationship and is not limited to linear associations. For a numerical measure of the linear relationship, Karl Pearson’s coefficient of correlation is used when variables are precisely measured. When variables are attributes or cannot be measured precisely, Spearman’s rank correlation can be used.
It is crucial to remember that these measures do not imply causation; they only indicate covariation. The knowledge of correlation provides a valuable understanding of the direction and intensity of change in one variable when the other correlated variable changes. This understanding is fundamental to many areas of economic analysis and policy-making.
NCERT Questions and Solutions
Question 1. The unit of correlation coefficient between height in feet and weight in kgs is
(i) kg/feet
(ii) percentage
(iii) non-existent
Answer: (iii) non-existent. The correlation coefficient is a pure number and has no unit of measurement.
Question 2. The range of simple correlation coefficient is
(i) 0 to infinity
(ii) minus one to plus one
(iii) minus infinity to infinity
Answer: (ii) minus one to plus one, since $-1 \leq r \leq +1$.
Question 3. If $r_{xy}$ is positive the relation between X and Y is of the type
(i) When Y increases X increases
(ii) When Y decreases X increases
(iii) When Y increases X does not change
Answer: (i) When Y increases X increases. A positive r means the two variables move in the same direction.
Question 4. If $r_{xy}$ = 0 the variable X and Y are
(i) linearly related
(ii) not linearly related
(iii) independent
Answer: (ii) not linearly related. A value of r = 0 indicates only that there is no linear correlation.
Question 5. Of the following three measures which can measure any type of relationship
(i) Karl Pearson’s coefficient of correlation
(ii) Spearman’s rank correlation
(iii) Scatter diagram
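Answer: (iii) Scatter diagram. It presents the relationship visually and is not limited to linear associations.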
Question 6. If precisely measured data are available the simple correlation coefficient is
(i) more accurate than rank correlation coefficient
(ii) less accurate than rank correlation coefficient
(iii) as accurate as the rank correlation coefficient
Answer:
Question 7. Why is r preferred to covariance as a measure of association?
Answer:
Question 8. Can r lie outside the –1 and 1 range depending on the type of data?
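Answer: No. Whatever the data, r always satisfies $-1 \leq r \leq +1$. A computed value outside this range indicates an error in calculation.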
Question 9. Does correlation imply causation?
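Answer: No. Correlation measures covariation only. Two variables may move together by coincidence or because a third variable influences both.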
Question 10. When is rank correlation more precise than simple correlation coefficient?
Answer:
Question 11. Does zero correlation mean independence?
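Answer: No. r = 0 rules out only a linear relationship. The variables may still be related in a non-linear way, so they need not be independent.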
Question 12. Can simple correlation coefficient measure any type of relationship?
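Answer: No. The simple correlation coefficient measures only the degree of linear relationship between two variables.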
Question 13. Collect the price of five vegetables from your local market every day for a week. Calculate their correlation coefficients. Interpret the result.
Answer:
Question 14. Measure the height of your classmates. Ask them the height of their benchmate. Calculate the correlation coefficient of these two variables. Interpret the result.
Answer:
Question 15. List some variables where accurate measurement is difficult.
Answer:
Question 16. Interpret the values of r as 1, –1 and 0.
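Answer: r = 1 means perfect positive linear correlation, r = –1 means perfect negative linear correlation, and r = 0 means no linear correlation (a non-linear relationship may still exist).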
Question 17. Why does rank correlation coefficient differ from Pearsonian correlation coefficient?
Answer:
Question 18. Calculate the correlation coefficient between the heights of fathers in inches (X) and their sons (Y)
| X | 65 | 66 | 57 | 67 | 68 | 69 | 70 | 72 |
| Y | 67 | 56 | 65 | 68 | 72 | 72 | 69 | 71 |
Answer:
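One way to work this out is to apply the deviation form of Karl Pearson's formula directly. The sketch below uses the figures exactly as printed in the table above. The function name `pearson_r` is an illustrative choice, not something given in the question.

```python
import math


def pearson_r(x, y):
    """Karl Pearson's r from deviations about the means."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_dxdy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_dx2 = sum((a - mean_x) ** 2 for a in x)
    sum_dy2 = sum((b - mean_y) ** 2 for b in y)
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)


# Heights in inches, copied from the table in the question.
fathers = [65, 66, 57, 67, 68, 69, 70, 72]   # X
sons = [67, 56, 65, 68, 72, 72, 69, 71]      # Y

r = pearson_r(fathers, sons)
print(round(r, 2))  # 0.44
```

With these figures the means are 66.75 and 67.5, the sum of products of deviations is 73, and the sums of squared deviations are 143.5 and 194, giving r = 73/√27839 ≈ 0.44. This indicates a moderate positive linear relationship between the heights of fathers and sons.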
Question 19. Calculate the correlation coefficient between X and Y and comment on their relationship:
| X | –3 | –2 | –1 | 1 | 2 | 3 |
| Y | 9 | 4 | 1 | 1 | 4 | 9 |
Answer:
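A short worked calculation, using the deviation form of Karl Pearson's formula with the data as printed above:

$\bar{X} = \frac{-3 - 2 - 1 + 1 + 2 + 3}{6} = 0, \qquad \bar{Y} = \frac{9 + 4 + 1 + 1 + 4 + 9}{6} = \frac{28}{6}$

$\sum(X - \bar{X})(Y - \bar{Y}) = \sum XY - \bar{Y}\sum X = \sum XY = -27 - 8 - 1 + 1 + 8 + 27 = 0$

$r = \frac{0}{\sqrt{\sum(X - \bar{X})^2 \sum(Y - \bar{Y})^2}} = 0$

So r = 0 and there is no linear correlation. Every pair nevertheless satisfies $Y = X^2$, so the two variables are exactly related in a non-linear way. A zero value of r therefore does not mean that the variables are unrelated.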
Question 20. Calculate the correlation coefficient between X and Y and comment on their relationship
| X | 1 | 3 | 4 | 5 | 7 | 8 |
| Y | 2 | 6 | 8 | 10 | 14 | 16 |
Answer:
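A short worked calculation for the data as printed above. Every pair satisfies $Y = 2X$, which simplifies the deviation form of Karl Pearson's formula:

$\bar{Y} = 2\bar{X} \quad\Rightarrow\quad Y - \bar{Y} = 2(X - \bar{X})$

$r = \frac{\sum(X - \bar{X}) \cdot 2(X - \bar{X})}{\sqrt{\sum(X - \bar{X})^2 \cdot 4\sum(X - \bar{X})^2}} = \frac{2\sum(X - \bar{X})^2}{2\sum(X - \bar{X})^2} = 1$

So r = +1: the correlation is perfect positive, and all the points lie exactly on an upward-sloping straight line.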